To convert between DNA, RNA, and protein sequences, DNA and RNA molecules provide the methods get_dna(), get_rna(), and get_protein(). These methods take an option keyword argument that selects which sequence is converted:

- option='coding': use the sequence read directly from the strand (default)
- option='complementary': use the complement of the coding sequence
- option='reverse_complementary': use the reverse complement of the coding sequence

The example below applies the get_dna() method in each of the three modes to a DNA sequence molecule.
In [1]:
from wc_rules.bioseq import DNA, RNA, Protein

inputstr = 'TTGTTATCGTTACCGGGAGTGAGGCGTCCGCGTCCCTTTCAGGTCAAGCGACTGAAAAACCTTGCAGTTGATTTTAAAGCGTATAGAAGACAATACAGA'
dna1 = DNA(ambiguous=False).set_sequence(inputstr)

# 1: the stored sequence, for reference
print('1: ' + dna1.get_sequence(as_string=True))
# 2-4: the same sequence converted in each of the three modes
print('2: ' + dna1.get_dna(option='coding', as_string=True))
print('3: ' + dna1.get_dna(option='complementary', as_string=True))
print('4: ' + dna1.get_dna(option='reverse_complementary', as_string=True))
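For readers who want to see what the three modes mean at the level of the bases, here is a minimal plain-Python sketch. It does not use wc_rules and is not its implementation; the function name convert_dna and the lookup table COMPLEMENT are hypothetical names introduced only for this illustration, and the sketch assumes an unambiguous alphabet of A, C, G, and T.

```python
# Plain-Python illustration of the three modes; NOT the wc_rules implementation.
# convert_dna and COMPLEMENT are hypothetical names used only in this sketch.
COMPLEMENT = str.maketrans('ACGT', 'TGCA')


def convert_dna(seq, option='coding'):
    """Return seq read in one of the three modes described above."""
    if option == 'coding':
        # Sequence exactly as read from the strand.
        return seq
    if option == 'complementary':
        # Swap each base for its Watson-Crick partner, keeping the order.
        return seq.translate(COMPLEMENT)
    if option == 'reverse_complementary':
        # Complement every base, then reverse the order.
        return seq.translate(COMPLEMENT)[::-1]
    raise ValueError(f'unknown option: {option!r}')


example = 'TTGTTATCGT'  # first ten bases of inputstr
for option in ('coding', 'complementary', 'reverse_complementary'):
    print(option + ': ' + convert_dna(example, option))
```

Applying the reverse-complementary mode twice gives back the original string, which is a quick sanity check for any implementation of these modes.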
Similar to get_sequence(), the methods get_dna(), get_rna(), and get_protein() can operate on subsequences defined by (start, end) or (start, length).
For example, to get the reverse-complementary RNA coded in the first 66 bases of dna1 and instantiate a new RNA molecule, do
In [2]:
seq = dna1.get_rna(option='reverse_complementary',start=0,length=66,as_string=True)
rna1 = RNA().set_sequence(seq)
print(rna1.get_sequence())
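The subsequence selection and the reverse-complementary RNA conversion can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the wc_rules code: it assumes that positions are 0-based, that (start, end) is half-open like a Python slice, that the window is cut from the coding strand before conversion, and that the RNA is the reverse complement with U in place of T. The names resolve_window and reverse_complementary_rna are hypothetical.

```python
# Plain-Python illustration; NOT the wc_rules implementation.
# Assumes 0-based, half-open windows and an unambiguous ACGT alphabet.
DNA_TO_RNA_COMPLEMENT = str.maketrans('ACGT', 'UGCA')


def resolve_window(seq_len, start=0, end=None, length=None):
    """Turn (start, end) or (start, length) into a half-open (start, end) pair."""
    if end is not None and length is not None:
        raise ValueError('give either end or length, not both')
    if length is not None:
        end = start + length
    if end is None:
        end = seq_len  # no bound given: run to the end of the sequence
    if not 0 <= start <= end <= seq_len:
        raise ValueError('window falls outside the sequence')
    return start, end


def reverse_complementary_rna(dna, start=0, end=None, length=None):
    """Cut the window from the DNA, complement it into RNA bases, then reverse."""
    lo, hi = resolve_window(len(dna), start, end, length)
    return dna[lo:hi].translate(DNA_TO_RNA_COMPLEMENT)[::-1]


dna = 'TTGTTATCGTTACCGG'
# Both calls select the same six bases, once by length and once by end.
print(reverse_complementary_rna(dna, start=0, length=6))
print(reverse_complementary_rna(dna, start=0, end=6))
```

Under these assumptions the two ways of specifying a window are interchangeable, since end is simply start plus length.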
To get the protein sequence coded in the first 66 bases of dna1 and instantiate a new Protein molecule, do
In [3]:
seq = dna1.get_protein(option='coding',start=0,length=66,as_string=True)
prot1 = Protein().set_sequence(seq)
print(prot1.get_sequence())
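To make the DNA-to-protein step concrete, the sketch below translates a coding-strand window with the standard genetic code in plain Python. It is only an illustration and may differ from what get_protein() does internally: it assumes the reading frame begins at start, writes stop codons as '*' without ending translation there, and ignores any trailing bases that do not fill a codon. The names CODON_TABLE and translate are hypothetical.

```python
# Plain-Python illustration of translation with the standard genetic code;
# NOT the wc_rules implementation, which may handle stop codons differently.
from itertools import product

BASES = 'TCAG'
# Amino acids for all 64 codons, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = (
    'FFLLSSSSYY**CC*W'
    'LLLLPPPPHHQQRRRR'
    'IIIMTTTTNNKKSSRR'
    'VVVVAAAADDEEGGGG'
)
CODON_TABLE = {
    ''.join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, start=0, length=None):
    """Translate dna[start:start+length] codon by codon, reading frame at start."""
    end = len(dna) if length is None else start + length
    window = dna[start:end]
    usable = len(window) - len(window) % 3  # drop an incomplete final codon
    return ''.join(CODON_TABLE[window[i:i + 3]] for i in range(0, usable, 3))


# First twelve bases of inputstr: codons TTG, TTA, TCG, TTA.
print(translate('TTGTTATCGTTACCGG', start=0, length=12))
```

A window of 66 bases, as in the example above, holds exactly 22 codons, so no bases are left over in that case.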